Frequency-warping in speech
Authors
Abstract
In this paper we present results indicating that the formant frequencies of different speakers scale differently at different frequencies. Based on our experiments on speech data, we then numerically compute a universal frequency-warping function that makes the scale factor independent of frequency in the warped domain. The proposed warping function is found to be similar to the mel scale, which has previously been derived from purely psychoacoustic experiments. The motivation for the present experiments stems from our recently proposed use of scale-transform-based cepstral coefficients [6] as acoustic features, since they provide better separability of vowels than mel-cepstral coefficients.
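As a rough illustration of the comparison made in the abstract, the sketch below uses the widely quoted analytic mel formula, mel(f) = 2595 log10(1 + f/700). This is an assumption made for illustration only: it is not the warping function computed numerically in the paper, and the helper names (`hz_to_mel`, `mel_to_hz`, `log_warp`, `warped_shift`) and the scale factor 1.2 are hypothetical. The sketch shows that a uniform scaling of frequency becomes a constant shift under a purely logarithmic warp, whereas under a mel-like warp the shift depends on frequency.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Common analytic mel approximation (assumed here, not taken from the paper)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def log_warp(f_hz: float) -> float:
    """Purely logarithmic warp: uniform scaling in Hz becomes a constant shift."""
    return math.log(f_hz)


def warped_shift(f_hz: float, alpha: float, warp) -> float:
    """Displacement in the warped domain when f_hz is scaled by alpha."""
    return warp(alpha * f_hz) - warp(f_hz)


# Hypothetical formant-like frequencies and a hypothetical speaker scale factor.
freqs = [300.0, 1000.0, 3000.0]
alpha = 1.2

log_shifts = [warped_shift(f, alpha, log_warp) for f in freqs]
mel_shifts = [warped_shift(f, alpha, hz_to_mel) for f in freqs]

for f, ls, ms in zip(freqs, log_shifts, mel_shifts):
    print(f"{f:7.1f} Hz  log-domain shift = {ls:.4f}  mel-domain shift = {ms:.1f}")
```

Under the log warp every frequency moves by the same amount, log(alpha). Under the mel-like warp the displacement is smaller at low frequencies, where the curve is close to linear, and larger at high frequencies, where it is close to logarithmic.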
Related articles
Robot Arm Performing Writing through Speech Recognition Using Dynamic Time Warping Algorithm
This paper aims to develop a writing robot that recognizes the speech signal from the user. The robot arm is constructed mainly for disabled people who cannot write on their own. Here, the dynamic time warping (DTW) algorithm is used to recognize the speech signal from the user. The action performed by the robot arm in the environment is done by reducing the redundancy which frequently fac...
Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling
The performance of automatic speech recognition (ASR) systems is adversely affected by variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variation among speakers is the difference in their vocal tract length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...
Formant-based frequency warping for improving speaker adaptation in HMM TTS
Vocal Tract Length Normalization (VTLN), usually implemented as a frequency warping procedure (e.g. bilinear transformation), has been used successfully to adapt the spectral characteristics to a target speaker in speech recognition. In this study we exploit the same concept of frequency warping but concentrate explicitly on mapping the first four formant frequencies of 5 long vowels from sourc...
A Frequency Warping Approach To Speaker Normalization - Speech and Audio Processing, IEEE Transactions on
In an effort to reduce the degradation in speech recognition performance caused by variation in vocal tract shape among speakers, a frequency warping approach to speaker normalization is investigated. A set of low complexity, maximum likelihood based frequency warping procedures have been applied to speaker normalization for a telephone based connected digit recognition task. This paper present...
Speaker normalized acoustic modeling based on 3-D Viterbi decoding
This paper describes a novel method for speaker normalization based on a frequency warping approach to reduce variations due to speaker-induced factors such as the vocal tract length. In our approach, a speaker normalized acoustic model is trained using time-varying (i.e., state, phoneme or word dependent) warping factors, while in the conventional approaches, the frequency warping factor is fixe...
Low-Dimensional Representation of Spectral Envelope Without Deterioration for Full-Band Speech Analysis/Synthesis System
A speech coding method for a full-band speech analysis/synthesis system is described. In this work, full-band speech is defined as speech with a sampling frequency above 40 kHz, whose Nyquist frequency covers the audible frequency range. In prior works, speech coding has generally focused on narrowband speech with a sampling frequency below 16 kHz. On the other hand, statistical parametric speech ...